Efficient Parallel And Serial Approximate String Matching

نویسندگان

Gad M. Landau

Uzi Vishkin

چکیده

Consider the string matching problem, where differences between characters of the pattern and characters of the text are allowed. Each difference is due to either a mismatch between a character of the text and a character of the pattern or a superfluous character in the text or a superfluous character in the pattern. Given a text of length n, a pattern of length m and an integer k, we present parallel and serial algorithms for finding all occurrences of the pattern in the text with at most k differences. The first part of the parallel algorithm consists of analysis of the pattern and takes O(log m) time using m 2 processors. The rest of the algorithm consists of handling the text. The text handling part applies the following new approach. This part starts by obtaining a concise characterization of the text which is based solely on substrings of the pattern in O(log m) time using n / log m processors. Then the desired output is derived from this characterization together with the tables built in the first part in O(k) time using n processors. The serial algorithm follows also this new approach for handling the text. It runs in O(kn) time for alphabet whose size is fixed. For general input the algorithm requires O(n(k + log m) ) time. In both cases the space requirement is O(n).

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Data structures and algorithms for approximate string matching

This paper surveys techniques for designing efficient sequential and parallel approximate string matching algorithms. Special attention is given to the methods for the construction of data structures that efficiently support primitive operations needed in approximate string matching.

متن کامل

Approximate Multiple Pattern String Matching using Bit Parallelism: A Review

String matching is to find all the occurrences of a given pattern in a large text both being sequence of characters drawn from finite alphabet set. Approximate String Matching involves the detection of correct patterns along with the detection of some wrong patterns inside the text. Bit Parallelism is a feature that can be used to detect patterns inside the text and is reported to result in mor...

متن کامل

A Parallel Algorithm for Fixed-Length Approximate String-Matching with k-mismatches

This paper deals with the approximate string-matching problem with Hamming distance. The approximate string-matching with kmismatches problem is to find all locations at which a query of length m matches a factor of a text of length n with k or fewer mismatches. The approximate string-matching algorithms have both pleasing theoretical features, as well as direct applications, especially in comp...

متن کامل

Tighter Packed Bit-Parallel NFA for Approximate String Matching

We propose a new variant of the bit-parallel NFA of Baeza-Yates and Navarro (BPD) for approximate string matching [1]. Given a length-m pattern and an error threshold k, the original BPD uses (m−k)(k +2) bits of space. We decrease this to (m− k)(k +1), and also give a slightly more efficient simulation algorithm for the NFA. In experiments our modified NFA is often noticeably more efficient tha...

متن کامل

Families of FPGA-based accelerators for approximate string matching

Dynamic programming for approximate string matching is a large family of different algorithms, which vary significantly in purpose, complexity, and hardware utilization. Many implementations have reported impressive speed-ups, but have typically been point solutions - highly specialized and addressing only one or a few of the many possible options. The problem to be solved is creating a hardwar...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 1986

Efficient Parallel And Serial Approximate String Matching

نویسندگان

چکیده

منابع مشابه

Data structures and algorithms for approximate string matching

Approximate Multiple Pattern String Matching using Bit Parallelism: A Review

A Parallel Algorithm for Fixed-Length Approximate String-Matching with k-mismatches

Tighter Packed Bit-Parallel NFA for Approximate String Matching

Families of FPGA-based accelerators for approximate string matching

عنوان ژورنال:

اشتراک گذاری